A Faster Grammar-Based Self-index
Authors
Abstract
To store and search genomic databases efficiently, researchers have recently started building compressed self-indexes based on grammars. In this paper we show how, given a straight-line program with r rules for a string S[1..n] whose LZ77 parse consists of z phrases, we can store a self-index for S in O(r + z log log n) space such that, given a pattern P[1..m], we can list the occ occurrences of P in S in O(m^2 + occ log log n) time. If the straight-line program is balanced and we accept a small probability of building a faulty index, then we can reduce the O(m^2) term to O(m log m). All previous self-indexes are larger or slower in the worst case.
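For readers unfamiliar with the term, a straight-line program is a context-free grammar that derives exactly one string. The sketch below is only an illustration of that input format, not the index described in the abstract: the grammar `RULES` and the helpers `length`, `char_at`, and `expand` are hypothetical names chosen for this example, and the grammar is assumed to be in binary form (each rule is either a single character or a pair of nonterminals).

```python
from functools import lru_cache

# Hypothetical toy straight-line program (not taken from the paper).
# Each nonterminal maps to a single character or to a pair of nonterminals.
RULES = {
    "A": "a",
    "B": "b",
    "C": ("A", "B"),  # derives "ab"
    "D": ("C", "C"),  # derives "abab"
    "E": ("D", "C"),  # derives "ababab"
}


@lru_cache(maxsize=None)
def length(symbol):
    """Length of the string derived from symbol, computed without expanding it."""
    rhs = RULES[symbol]
    if isinstance(rhs, str):
        return 1
    left, right = rhs
    return length(left) + length(right)


def char_at(symbol, i):
    """Character at 0-based position i of the derived string.

    Assumes 0 <= i < length(symbol). Walks down the parse tree, so the
    cost is proportional to the grammar's height, not to the string length.
    """
    rhs = RULES[symbol]
    while not isinstance(rhs, str):
        left, right = rhs
        if i < length(left):
            symbol = left
        else:
            i -= length(left)
            symbol = right
        rhs = RULES[symbol]
    return rhs


def expand(symbol):
    """Fully expand symbol into its string (for checking small examples only)."""
    rhs = RULES[symbol]
    if isinstance(rhs, str):
        return rhs
    return expand(rhs[0]) + expand(rhs[1])
```

Here the grammar has r = 5 rules and derives a string of length n = 6. The `char_at` walk also hints at why a balanced grammar, whose height is logarithmic in n, is convenient to work with.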
Similar resources
Fast and Tiny Structural Self-Indexes for XML
XML document markup is highly repetitive and therefore well compressible using dictionary-based methods such as DAGs or grammars. In the context of selectivity estimation, grammar-compressed trees were used before as synopsis for structural XPath queries. Here a fully-fledged index over such grammars is presented. The index allows to execute arbitrary tree algorithms with a slow-down that is co...
Comparing confidence-based and conventional scoring methods: The case of an English grammar class
This study aimed at investigating the reliability, predictive validity, and self-esteem and gender bias of confidence-based scoring. This is a method of scoring in which the test takers receive a positive or negative point based on their rating of their confidence in an answer. The participants, who were 49 English-major students taking their grammar course, were given 8 multiple-choice tests d...
Online Self-Indexed Grammar Compression
Although several grammar-based self-indexes have been proposed thus far, their applicability is limited to offline settings where whole input texts are prepared, thus requiring to rebuild index structures for given additional inputs, which is often the case in the big data era. In this paper, we present the first online self-indexed grammar compression named OESP-index that can gradually build ...
How Attitude, Self-efficacy, and Job Satisfaction Relate with Teaching Strategies?
The primary purpose of the present study was to explore whether there was any significant relationship between attitude, self-efficacy, and job satisfaction of Iranian EFL teachers on the one hand, and their choice of teaching strategies. Strategies mostly used by participants of the study with low, mid, and high levels of self-efficacy comprised another purpose of the study. To this end, a que...
Self-Indexed Grammar-Based Compression
Self-indexes aim at representing text collections in a compressed format that allows extracting arbitrary portions and also offers indexed searching on the collection. Current self-indexes are unable of fully exploiting the redundancy of highly repetitive text collections that arise in several applications. Grammar-based compression is well suited to exploit such repetitiveness. We introduce th...